4 <110> APPLICANT: Croce, Carlo M.

6 <120> TITLE OF INVENTION: Nitrilase Homologs 

9 <130> FILE REFERENCE: CRO01.NP001 

11 <140> CURRENT APPLICATION NUMBER: 09/357,675C

12 <141> CURRENT FILING DATE: 1999-07-20

14 <150> PRIOR APPLICATION NUMBER: 60/093,350

15 <151> PRIOR FILING DATE: 1998-07-20
17 <160> NUMBER OF SEQ ID NOS: 31

19 <170> SOFTWARE: FastSEQ for Windows Version 4.0 

21 <210> SEQ ID NO: 1 

22 <211> LENGTH: 1416 

23 <212> TYPE: DNA 

24 <213> ORGANISM: Homo sapiens 

26 <220> FEATURE: 

27 <221> NAME/KEY: misc_feature

28 <222> LOCATION: (19)...(19)

29 <223> OTHER INFORMATION: n-a 
31 <400> SEQUENCE: 1

32 gcccactcgc tgcggcctnt ctggctccag accgccctcc ggatcggacc ctgcgaatgg 60
33 ttttggctat atcttcatgt aggacctact ccctatcccg tcggccgcgg ctgggcttca 120
34 tcaccaggcc tcctcacaga ttcctgtccc ttctgtgtcc tggactccgg atacctcaac 180
35 tctcagtact ttgtgctcag cccaggccca gagccatggc tatctcctct tcctcctgcg 240
36 aactgcccct ggtggctgtg tgccaggtaa catcgacgcc agacaagcaa cagaacttta 300
37 aaacatgtgc tgagctggtt cgagaggctg ccagactggg tgcctgcctg gctttcctgc 360
38 ctgaggcatt tgacttcatt gcacgggacc ctgcagagac gctacacctg tctgaaccac 420
39 tgggtgggaa acttttggaa gaatacaccc agcttgccag ggaatgtgga ctctggctgt 480
40 ccttgggtgg tttccatgag cgtggccaag actgggagca gactcagaaa atctacaatt 540
41 gtcacgtgct gctgaacagc aaaggggcag tagtggccac ttacaggaag acacatctgt 600
42 gtgacgtaga gattccaggg caggggccta tgtgtgaaag caactctacc atgcctgggc 660
43 ccagtcttga gtcacctgtc agcacaccag caggcaagat tggtctagct gtctgctatg 720
44 acatgcggtt ccctgaactc tctctggcat tggctcaagc tggagcagag atacttacct 780
45 atccttcagc ttttggatcc attacaggcc cagcccactg ggaggtgttg ctgcgggccc 840
46 gtgctatcga aacccagtgc tatgtagtgg cagcagcaca gtgtggacgc caccatgaga 900
47 agagagcaag ttatggccac agcatggtgg tagacccctg gggaacagtg gtggcccgct 960
48 gctctgaggg gccaggcctc tgccttgccc gaatagacct caactatctg cgacagttgc 1020
49 gccgacacct gcctgtgttc cagcaccgca ggcctgacct ctatggcaat ctgggtcacc 1080
50 cactgtctta agacttgact tctgtgagtt tagacctgcc cctcccaccc ccaccctgcc 1140
51 actatgagct agtgctcatg tgacttggag gcaggatcca ggcacagctc ccctcacttg 1200
52 gagaaccttg actctcttga tggaacacag atgggctgct tgggaaagaa actttcacct 1260
53 gagcttcacc tgaggtcaga ctgcagtttc agaaaggtgg aattttatat agtcattgtt 1320
54 tatttcatgg aaactgaagt tctgctgagg gctgagcagc actggcattg aaaaatataa 1380
55 taatcataaa gtcaaaaaaa aaaaaaaaaa aaaaaa 1416

57 <210> SEQ ID NO: 2

58 <211> LENGTH: 23

59 <212> TYPE: DNA

60 <213> ORGANISM: Homo sapiens

62 <400> SEQUENCE: 2

63 tctgaaactg cagtctgacc tca 23



65 <210> SEQ ID NO: 3 

66 <211> LENGTH: 21 

67 <212> TYPE: DNA 

68 <213> ORGANISM: Homo sapiens 

70 <400> SEQUENCE: 3 

71 caggcacagc tcccctcact t 21 

73 <210> SEQ ID NO: 4 

74 <211> LENGTH: 20 

75 <212> TYPE: DNA 

76 <213> ORGANISM: Homo sapiens 

78 <220> FEATURE: 

79 <221> NAME/KEY: misc_feature 

80 <222> LOCATION: (0)...(0) 

81 <223> OTHER INFORMATION: n=a, g, c or t 
83 <400> SEQUENCE: 4 



86 <210> SEQ ID NO: 5 

87 <211> LENGTH: 26 

88 <212> TYPE: DNA 

89 <213> ORGANISM: Homo sapiens 

91 <220> FEATURE: 

92 <221> NAME/KEY: misc_feature 

93 <222> LOCATION: (0)...(0) 

94 <223> OTHER INFORMATION: n=a,c,g, or t and y=c or t 
96 <400> SEQUENCE: 5 



99 <210> SEQ ID NO: 6 

100 <211> LENGTH: 21 

101 <212> TYPE: DNA 

102 <213> ORGANISM: Drosophila melanogaster 

104 <400> SEQUENCE: 6 

105 gcgcctttgt ggcctcgact g 21 

107 <210> SEQ ID NO: 7 

108 <211> LENGTH: 21 

109 <212> TYPE: DNA 

110 <213> ORGANISM: Drosophila melanogaster 

112 <400> SEQUENCE: 7 

113 cggtggcgga agttgtctgg t 21 

115 <210> SEQ ID NO: 8 

116 <211> LENGTH: 20 

117 <212> TYPE: DNA 

118 <213> ORGANISM: Caenorhabditis elegans 

120 <400> SEQUENCE: 8 

121 gtggcggctg ctcaaactgg 20 




20 




26 
123 <210> SEQ ID NO: 9

124 <211> LENGTH: 21 

125 <212> TYPE: DNA 

126 <213> ORGANISM: Caenorhabditis elegans 

128 <400> SEQUENCE: 9 

129 tcgcgacgat gaacaagtcg g 21 

131 <210> SEQ ID NO: 10 

132 <211> LENGTH: 19 

133 <212> TYPE: DNA 

134 <213> ORGANISM: Homo sapiens 

136 <400> SEQUENCE: 10 

137 gccctccgga tcggaccct 19 

139 <210> SEQ ID NO: 11

140 <211> LENGTH: 20 

141 <212> TYPE: DNA 

142 <213> ORGANISM: Homo sapiens 

144 <400> SEQUENCE: 11 

145 gacctactcc ctatcccgtc 20 

147 <210> SEQ ID NO: 12 

148 <211> LENGTH: 21 

149 <212> TYPE: DNA 

150 <213> ORGANISM: Homo sapiens 

152 <400> SEQUENCE: 12 

153 gctgcgaagt gcacagctaa g 21 

155 <210> SEQ ID NO: 13 

156 <211> LENGTH: 24 

157 <212> TYPE: DNA 

158 <213> ORGANISM: Homo sapiens 

160 <400> SEQUENCE: 13 

161 aaactgaagc ctctttcctc tgac 24 

163 <210> SEQ ID NO: 14 

164 <211> LENGTH: 20 

165 <212> TYPE: DNA 

166 <213> ORGANISM: Homo sapiens 

168 <400> SEQUENCE: 14 

169 tgggcttcat caccaggcct 20 

171 <210> SEQ ID NO: 15 

172 <211> LENGTH: 22 

173 <212> TYPE: DNA 

174 <213> ORGANISM: Homo sapiens 

176 <400> SEQUENCE: 15 

177 ctgggctgag cacaaagtac tg 22 

179 <210> SEQ ID NO: 16 

180 <211> LENGTH: 21 

181 <212> TYPE: DNA 

182 <213> ORGANISM: Homo sapiens 

184 <400> SEQUENCE: 16 

185 gcttgtctgg cgtcgatgtt a 21 
187 <210> SEQ ID NO: 17 
188 <211> LENGTH: 36

189 <212> TYPE: DNA 

190 <213> ORGANISM: Homo sapiens 

192 <400> SEQUENCE: 17 

193 tgacgtcgac atatgtcaac tctagttaat accacg 36 

195 <210> SEQ ID NO: 18 

196 <211> LENGTH: 25 

197 <212> TYPE: DNA 

198 <213> ORGANISM: Homo sapiens 

200 <400> SEQUENCE: 18

201 tgggtacctc gactagctta tgtcc 25

203 <210> SEQ ID NO: 19 

204 <211> LENGTH: 147 

205 <212> TYPE: PRT 

206 <213> ORGANISM: Homo sapiens

208 <220> FEATURE: 

209 <223> OTHER INFORMATION: Xaa is an unknown amino acid 
211 <400> SEQUENCE: 19

212 Met Ser Phe Arg Phe Gly Gln His Leu Ile Lys Pro Ser Val Val Phe
213 1               5                   10                  15

214 Leu Lys Thr Glu Leu Ser Phe Ala Leu Val Asn Arg Lys Pro Val Val
215             20                  25                  30

216 Pro Gly His Val Leu Val Cys Pro Leu Arg Pro Val Glu Arg Phe His
217         35                  40                  45

218 Asp Leu Arg Pro Asp Glu Val Ala Asp Leu Phe Gln Thr Thr Gln Arg
219     50                  55                  60

220 Val Gly Thr Val Val Glu Lys His Phe His Gly Thr Ser Leu Thr Phe
221 65                  70                  75                  80

222 Ser Xaa Gln Asp Gly Pro Glu Ala Gly Gln Thr Val Lys His Val His
223                 85                  90                  95

224 Val His Val Leu Pro Arg Lys Ala Gly Asp Phe His Arg Asn Asp Ser
225             100                 105                 110

226 Ile Tyr Glu Glu Leu Gln Lys His Asp Lys Glu Asp Phe Pro Ala Ser
227         115                 120                 125

228 Trp Arg Ser Glu Glu Glu Glu Ala Ala Glu Ala Ala Ala Leu Arg Val
229     130                 135                 140

230 Tyr Phe Gln
231 145

234 <210> SEQ ID NO: 20 

235 <211> LENGTH: 150 

236 <212> TYPE: PRT 

237 <213> ORGANISM: murine 
239 <400> SEQUENCE: 20 

240 Met Ser Phe Arg Phe Gly Gln His Leu Ile Lys Pro Ser Val Val Phe
241 1               5                   10                  15

242 Leu Lys Thr Glu Leu Ser Phe Ala Leu Val Asn Arg Lys Pro Val Val
243             20                  25                  30

244 Pro Gly His Val Leu Val Cys Pro Leu Arg Pro Val Glu Arg Phe Arg
245         35                  40                  45
246


Asp 


Leu 


U-i g 
nib 


rl <J 


1 cn 
rlop 




Va 1 

V aX 


Al a 

riia 


A en 




IT 11C 


OX 11 


Va 1 
v ax 


± 11X 


Olll 


A T*rr 
ril y 


9 A7 














55 










60 










9 A ft 
Z ft o 


VQl 


oxy 


1 XIX 


V d. X 


Va 1 
Vdl 


UlU 


ijy & 


n i o 




OX 11 


oxy 


± in 


OCX 


Tip 
X xc 


± in 


IT 11C 


0 4 Q 

z 4 y 


65 










10 










7 ^ 










O \J 


Z jU 


Ser 


Met 


Gin 


Asp 


Gly 


Drri 


V3J. U 


Ala 

t\ 1 CL 




OX 11 


Thr 


Val 


T.uc 
Xljf o 


His 


Val 


His 


9 










85 




















Q5 




9 S9 
Z J z 


Val 


His 


Val 


Leu 


Pro 


Arg 


Lys 


Ala 


oxy 


Asp 


Phe 


Pro 


Arg 


Asn 


Asp 


Asn 










100 






Lys 




1 05 

X U J 










110 

XXV 






9 RA 
Z Oft 


He 


Tyr 


Asp 


Glu 


Leu 


Gin 


III O 


A 


A T*rr 
r\i y 


m n 

ox u. 


uiu 


m ii 


A cri 


Cpr 
OC1 




ZJJ 






115 










1 90 

1ZU 










1 9S 

1ZJ 








9 Rfi 


Ala 


Phe 


Trp 


Arg 


Ser 


Glu 


Lys 


ox u 




Ala 


Ala 


Glu 


Ala 


Glu 


Ala 


Leu 


9 

ZD / 




130 










135 




















9 Sft 

Z JO 


Arg 


Val 


Tyr 


Phe 


Gin 


Ala 






















9 RQ 

Z J7 


145 










150 






















262 <210> SEQ ID NO: 21

263 <211> LENGTH: 327

264 <212> TYPE: PRT

265 <213> ORGANISM: Homo sapiens

267 <400> SEQUENCE: 21
























9 £ Q 
ZOO 


Met 


Leu 


Gly 


Phe 


i ±e 


l nr 


Arg 


Jr 1 (J 


Jr X vj 


tin a 


Arg 


DVio 
xr lie 


T on 


OCX 


T.on 
XjCU. 


T.on 
Jjcu 


9£Q 


1 








5 










1 0 
X u 










1 S 
X — ' 




9 7ri 
z / U 


Cys 


Pro 


Gly 


Leu 


Arg 


He 


Pro 


OXll 


Leu 


OCX 


Va 1 
Vdl 


Leu 


^.y is 


Ala 


OX 11 


Jr X 


971 
Z / X 








20 










9^ 

z ~> 
















9 7 9 
Z / Z 


Arg 


Pro 


Arg 


Ala 


Met 


Ala 


He 


Got* 
OCX 


OCX 


Del 


OCX 




Oil 


T.on 


XT 1 (J 


T.on 
XjCU. 


z / o 






35 










AO 
ft U 










4S 








0 7/ 
Z / 4 


Val 


Ala 


Val 


Cys 


Gin 


Val 


Thr 


Ser 


Tnr 


Pro 


nop 


xiy o 


Pin 

OX 11 


OXIl 


A g *n 


Pho 

XT 11C 


07c 

Z / D 




50 










55 










fin 










9 9 C 
Z / 0 


Lys 


Thr 


Cys 


Ala 


Glu 


Leu 


Val 


Arg 






Ala 
nla 


Arg 


Leu 


oxy 


Ala 
nla 


^-y o 


Oil 
Z / / 


65 










70 




















80 


9 7 Q 
Z / O 


Leu 


Ala 


Phe 


Leu 


Pro 


Glu 


Ala 


DVio 

rim 


Asp 


flic 


lie 


Ala 
ril d 


A TTT 

r\.x y 


A c r~\ 
nbJJ 


t 1 LJ 


Ala 

nla 


9 7 Q 

z / y 










85 
























9 ftPl 
Z OU 


Glu 


Thr 


Leu 


His 


Leu 


Ser 


Glu 


rl \j 


T.on 
lieu. 




OX jr 


Ijy o 


T.on 

LlCU 




Ol Ul 


OX u. 


9R1 
ZOl 








100 










105 










110 






9 ft9 
ZOZ 


Tyr 


Thr 


Gin 


Leu 


Ala 


Arg 


Glu 


y o 


oiy 


T.on 


xrp 


T.on 
XjCLI 


Cor 
OCX 


T.oi i 

UCLl 


oxy 




ZOO 






115 










1 90 
Izu 










1 9S 

1Z J 








9 ft A 
Z Oft 


Phe 


His 


Glu 


Arg 


Gly 


Gin 


Asp 


irp 


Ol LL 
OX 11


J. ill 
T.T7C

Xijr o 


11C 
A g n

A Oil 
Z O J




130 
Cys
Val


Leu 


Leu 


Asn 


Ser 


Lys 


oxy 


Ala 


Va 1 

vax 


Va 1 
val 


Ala 
Ala 


Tnr 


Tyr 


Arg 


287 


145 










150 










155 










160 


288 Lys Thr His Leu Cys Asp Val Glu Ile Pro Gly Gln Gly Pro Met Cys
289                 165                 170                 175

290 Glu Ser Asn Ser Thr Met Pro Gly Pro Ser Leu Glu Ser Pro Val Ser
291             180                 185                 190

292 Thr Pro Ala Gly Lys Ile Gly Leu Ala Val Cys Tyr Asp Met Arg Phe
293         195                 200                 205

294 Pro Glu Leu Ser Leu Ala Leu Ala Gln Ala Gly Ala Glu Ile Leu Thr
295     210                 215                 220

296 Tyr Pro Ser Ala Phe Gly Ser Ile Thr Gly Pro Ala His Trp Glu Val
297 225                 230                 235                 240
Use of n's or Xaa's (NEW RULES):

Use of n's and/or Xaa's has been detected in the Sequence Listing.

Use of <220> to <223> is MANDATORY if n's or Xaa's are present.

In the <220> to <223> section, please explain the location of n or Xaa, and which residue n or Xaa represents.

Seq#:1; N Pos. 19
Seq#:4; N Pos. 3,6,9,12,18
Seq#:5; N Pos. 6,15,18,24
Seq#:19; Xaa Pos. 82
Seq#:25; Xaa Pos. 6
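
The positions reported above can be reproduced with a short script. The following is only a minimal sketch, not the tool that generated this notice: the function names n_positions and xaa_positions are hypothetical, and the two inputs are the first row of SEQ ID NO: 1 and the row of SEQ ID NO: 19 covering residues 81 to 96, transcribed from this listing. Positions are counted from 1, ignoring spaces and trailing counts.

```python
import re


def n_positions(dna: str) -> list[int]:
    """Return 1-based positions of 'n' in a nucleotide row or sequence.

    Spaces, digits, and line breaks are dropped before counting, so the
    grouped ten-base layout of the listing can be pasted in directly.
    """
    bases = re.sub(r"[^a-z]", "", dna.lower())
    return [i for i, base in enumerate(bases, start=1) if base == "n"]


def xaa_positions(protein: str, start: int = 1) -> list[int]:
    """Return positions of 'Xaa' in a three-letter-code protein row.

    `start` is the residue number of the first residue in the row, so a
    single row can be checked without the rows that precede it.
    """
    residues = re.findall(r"[A-Z][a-z]{2}", protein)
    return [i for i, res in enumerate(residues, start=start) if res == "Xaa"]


# Hypothetical inputs transcribed from the listing (assumed to be accurate).
first_row_seq1 = "gcccactcgc tgcggcctnt ctggctccag"
row_81_96_seq19 = "Ser Xaa Gln Asp Gly Pro Glu Ala Gly Gln Thr Val Lys His Val His"

print(n_positions(first_row_seq1))                 # [19]
print(xaa_positions(row_81_96_seq19, start=81))    # [82]
```

Both results agree with the entries for Seq#:1 and Seq#:19 in the list above. Each reported position would then need a matching <220> to <223> feature block stating which residue the n or Xaa represents.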